McShane and Wyner (2011) (henceforth MW) analyze a dataset of "proxy"
climate records previously used by Mann et al. (2008) (henceforth M08),
attempting to assess the utility of these records in reconstructing past
temperatures. MW introduce new methods in their analysis, which is welcome.
However, the absence of both proper data quality control and appropriate
"pseudoproxy" tests to assess the performance of their methods invalidates
their main conclusions.

We deal first with the issue of data quality. In the frozen AD 1000 network
of 95 proxy records used by MW, 36 tree-ring records were not used by M08
because they failed to meet objective standards of reliability. These
records did not meet the minimal replication requirement of at least eight
independent contributing tree cores (as described in the Supplemental
Information of M08). That requirement yields a smaller dataset of 59 proxy
records back to AD 1000, as clearly indicated in M08. MW's inclusion of the
additional poor-quality proxies has a material effect on the
reconstructions, inflating the level of peak apparent Medieval warmth,
particularly in their featured "OLS PC10" (K = 10 PCs of the proxy data used
as predictors of instrumental mean NH land temperature) reconstruction. The
further elimination of four potentially contaminated "Tiljander" proxies (as
tested in M08), which yields a set of 55 proxies, further reduces the level
of peak Medieval warmth [Figure 1(a); cf. Figure 14 in MW; see also
Supplementary Figures S1-S2; Schmidt, Mann and Rutherford (2011a, 2011b)].
M08 also tested the impact of removing tree-ring data, including
controversial long "Bristlecone pine" tree-ring records; recent work [cf.
Salzer et al. (2009)], however, demonstrates that those data contain a
reliable long-term temperature signal.
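
The screening steps described above can be sketched in a few lines of
Python. This is only an illustration: the `ProxyRecord` fields, the `screen`
helper and the toy records are hypothetical and are not taken from M08 or
from the supplementary scripts. Only the eight-core replication threshold
and the removal of potentially contaminated records come from the text.

```python
from dataclasses import dataclass

MIN_CORES = 8  # minimal replication requirement for tree-ring records (per the text)


@dataclass
class ProxyRecord:
    name: str              # hypothetical identifier
    kind: str              # e.g. "tree_ring", "lake_sediment"
    n_cores: int = 0       # contributing tree cores (tree-ring records only)
    flagged: bool = False  # True for potentially contaminated records


def screen(records, drop_flagged=True):
    """Keep records that pass the replication rule and, optionally, are not flagged."""
    kept = []
    for r in records:
        if r.kind == "tree_ring" and r.n_cores < MIN_CORES:
            continue  # fails the replication requirement
        if drop_flagged and r.flagged:
            continue  # potentially contaminated record
        kept.append(r)
    return kept


# Toy network, invented purely for illustration.
toy = [
    ProxyRecord("tree_a", "tree_ring", n_cores=12),
    ProxyRecord("tree_b", "tree_ring", n_cores=5),
    ProxyRecord("lake_c", "lake_sediment", flagged=True),
    ProxyRecord("speleothem_d", "speleothem"),
]
print([r.name for r in screen(toy)])
```

Applied to the real network, the two rules correspond to the steps from 95
to 59 records and from 59 to 55 records.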

The MW "OLS PC10" reconstruction has greater peak apparent Medieval warmth
in comparison with M08 or any of a dozen similar hemispheric temperature
reconstructions [Jansen et al. (2007)]. That additional warmth, as shown
above, largely disappears with the use of the more appropriate dataset.
Using their reconstruction, MW nonetheless still found recent warmth to be
unusual in a long-term context: they estimate an 80% probability that the
decade 1997-2006 is warmer than any other for at least the past 1000 years.
Using the more appropriate 55-proxy dataset with the same (K = 10)
estimation procedure, we calculate a higher probability of 86% that recent
decadal warmth is unprecedented for the past millennium [Figure 1(b)].
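
A probability of this kind can be read off an ensemble of reconstructions
as the fraction of members in which the final decade is the warmest. The
sketch below shows that counting step only. It assumes non-overlapping
decades aligned to the end of the record, which may differ from the exact
definition used by MW, and the ensemble is synthetic noise with an imposed
warm final decade, not an actual reconstruction. The function names are
illustrative.

```python
import random
import statistics


def decadal_means(series, width=10):
    """Means of non-overlapping blocks of `width` values, aligned to the end."""
    start = len(series) % width  # drop leftover values at the beginning
    return [statistics.fmean(series[i:i + width])
            for i in range(start, len(series), width)]


def prob_last_decade_warmest(ensemble, width=10):
    """Fraction of ensemble members whose final decade exceeds all earlier ones."""
    hits = 0
    for member in ensemble:
        means = decadal_means(member, width)
        if means[-1] > max(means[:-1]):
            hits += 1
    return hits / len(ensemble)


# Synthetic ensemble: white noise with the final decade shifted upward.
rng = random.Random(0)
n_years = 1000
ensemble = []
for _ in range(200):
    member = [rng.gauss(0.0, 0.2) for _ in range(n_years)]
    for i in range(n_years - 10, n_years):
        member[i] += 0.3  # imposed recent warmth (arbitrary illustrative value)
    ensemble.append(member)

print(prob_last_decade_warmest(ensemble))
```

Because the estimate is a simple ensemble fraction, it varies from one Monte
Carlo realization to the next.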

However, K = 10 principal components is almost certainly too large, and
the resulting reconstruction likely suffers from statistical over-fitting.
Objective selection criteria applied to the M08 AD 1000 proxy network (see
Supplementary Figure S4), as well as independent "pseudoproxy" analyses
discussed below, favor retaining only K = 4 ("OLS PC4" in the terminology
of MW). Using this reconstruction, we observe a very close match [e.g.,
Figure 1(a)] with the relevant M08 reconstruction, and we calculate
considerably higher probabilities, up to 99%, that recent decadal warmth is
unprecedented for at least the past millennium [Figure 1(c)]. These
posterior probabilities imply substantially higher confidence than the
"likely" assessment by M08 and IPCC (2007) (a 67% level of confidence).
Indeed, a probability of 99% not only exceeds the IPCC "very likely"
threshold (90%), but reaches the "virtually certain" (99%) threshold.
However, since these posterior probabilities do not take into account
potential systematic issues in the source data, are sensitive to
methodological choices, and vary by a few percent depending on the MCMC
realization, we maintain that a "likely" conclusion is most consistent with
the balance of evidence [IPCC (2007)].

There are additional methodological weaknesses in the techniques employed
by MW that require discussion. MW mix incommensurate (decadal vs. annual
resolution) proxy data in their procedure, a problem that is avoided by the
"hybrid" frequency band calibration method used by M08. A reconstruction
using a version of the proxy data that was consistently low-pass filtered
to retain only decadal features shows even better agreement with the M08
reconstruction (Supplementary Figure S3).

[Figure 1 panels: (a) Impact of proper proxy selection + no Tiljander;
(c) Impact of proper proxy selection/No Tilj/OLS PC4]
Fig. 1. Reconstructions of mean Northern Hemisphere land temperatures over
the past millennium for various methodological choices (cf. MW Figure 14).
(a) Results using the M08 frozen AD 1000 network of 59 minus 4 "Tiljander"
proxy records (corresponding results based on all 59 records are shown in
Supplementary Figure S1). Shown for comparison are the original MW results
and the Mann et al. (2008) "EIV" decadal "CRU" NH land temperature
reconstruction based on the identical proxy data. The OLS reconstructions
have been filtered with a loess smoother (span = 0.05) to emphasize
low-frequency (greater than 50 year) variations. Associated annual
reconstructions are shown in Supplementary Figure S2. (b) Comparison of
Monte Carlo ensemble (and mean) reconstructions using "OLS PC10" as in MW
Figure 16. Labeled reconstructions are in color; grey lines are the total
set of MW reconstructions after allowing for uncertainties in the
coefficients. (c) As in (b) above but instead using "OLS PC4."

Furthermore, methods using simple Ordinary Least Squares (OLS) regressions
of principal components of the proxy network and instrumental data suffer
from known biases, including the underestimation of variance [see, e.g.,
Hegerl et al. (2006)]. The spectrally "red" nature of the noise present in
proxy records poses a particular challenge [e.g., Jones et al. (2009)]. A
standard benchmark in the field is the use of synthetic proxy data known as
"pseudoproxies" derived from long-term climate model simulations where the
true climate history is known, and the skill of the particular method can be
evaluated [see, e.g., Mann et al. (2007); Jones et al. (2009) and numerous
references therein]. (We note that the term "pseudoproxy" was misused in MW
to instead denote various noise models.) In contrast to the MW claim that
their methods perform "fairly similarly," these tests show dramatic
differences in model performance (Figure 2). Indeed, the various flavors of
OLS and, particularly, the "Lasso" method (used only in the first half of
MW), suffer from serious underestimation biases in comparison with, for
example, the hybrid RegEM approach of M08 (see also Table S1).
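
The construction of a pseudoproxy and the variance loss of a simple OLS
fit can be illustrated with a minimal sketch. It uses the noise
characteristics quoted in the Figure 2 caption (AR(1) noise with lag-one
autocorrelation 0.32 and a signal-to-noise amplitude ratio of 0.4), but
everything else is an assumption made for illustration: the "true"
temperature is a synthetic sinusoid plus noise rather than climate model
output, a single proxy is used instead of a network, and the regression is
fitted over the whole record rather than over a calibration interval. The
helper names are hypothetical.

```python
import math
import random
import statistics


def ar1_noise(n, rho, rng):
    """Unit-variance AR(1) ("red") noise: e[t] = rho * e[t-1] + w[t]."""
    innovation_sd = math.sqrt(1.0 - rho ** 2)  # keeps the stationary variance at 1
    e = [rng.gauss(0.0, 1.0)]
    for _ in range(n - 1):
        e.append(rho * e[-1] + rng.gauss(0.0, innovation_sd))
    return e


def make_pseudoproxy(signal, snr, rho, rng):
    """Signal plus AR(1) noise scaled so that sd(signal) / sd(noise) = snr."""
    noise = ar1_noise(len(signal), rho, rng)
    scale = statistics.pstdev(signal) / (snr * statistics.pstdev(noise))
    return [s + scale * e for s, e in zip(signal, noise)]


def ols_fit(x, y):
    """Slope and intercept of the least-squares line y = intercept + slope * x."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


rng = random.Random(42)
# Stand-in "true" temperature history (synthetic, not model output).
truth = [0.3 * math.sin(2 * math.pi * t / 200) + rng.gauss(0.0, 0.1)
         for t in range(1000)]
proxy = make_pseudoproxy(truth, snr=0.4, rho=0.32, rng=rng)

# Regress temperature on the proxy and compare variances.
slope, intercept = ols_fit(proxy, truth)
recon = [intercept + slope * p for p in proxy]
ratio = statistics.pvariance(recon) / statistics.pvariance(truth)
print(f"reconstructed / true variance: {ratio:.2f}")
```

In this single-proxy setting the variance ratio equals the squared
correlation between proxy and temperature, which for independent noise is
SNR^2 / (1 + SNR^2) in expectation, or about 0.14 for SNR = 0.4. Because the
true series is known, the loss of amplitude can be measured directly, which
is the purpose of a pseudoproxy test.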

Taken together, these points demonstrate that any conclusions regarding
the utility of proxies in reconstructing past climate drawn by MW were,
at best, overstated. Assessing the skill of methods that do not work well
(such as Lasso) and concluding that no method can therefore work well is
logically flawed. Additional problems exist in their assessment procedure:
reducing the size of the hold-out periods to 30 years from 46 years in M08,
for instance, makes it more difficult to meaningfully diagnose statistical
skill.

[Figure 2 panels: (a) Pseudo-proxy test (N=59): CSM; (b) Pseudo-proxy test
(N=59): GKSS]
Fig. 2. Pseudoproxy tests of reconstruction methodologies used by MW and
comparison with the hybrid and nonhybrid RegEM EIV methods used by M08. The
pseudoproxy networks are defined by a randomly selected set of gridboxes
using two different coupled ocean-atmosphere general circulation model
(OAGCM) simulations subjected to estimated natural and anthropogenic forcing
over the past millennium. Pseudoproxies are constructed assuming "red" proxy
noise [AR(1) with ρ = 0.32] yielding a mean signal-to-noise amplitude ratio
of SNR = 0.4, characteristics which are consistent with estimates from
actual proxy data networks [see Mann et al. (2007)]. All reconstructions use
a calibration interval of 1856-1980. Figure shows results for a 59-location
network including (a) NCAR CSM and (b) GKSS simulations and a network with
104 locations for (c) CSM and (d) GKSS. Labeled reconstructions are in
color; grey lines are the total set of MW reconstruction techniques. Note
that uncertainties are reduced for the larger network, where the
underestimation bias becomes negligible for the hybrid RegEM EIV method.

Problems in climate research, such as statistical climate reconstruction,
require sophisticated statistical approaches and a thorough understanding
of the data used. Moreover, investigations of the underlying spatial patterns
of past climate changes, rather than simply hemispheric mean temperature
estimates, are most likely to provide insights into climate dynamics [e.g.,
Mann et al. (2009), Schmidt (2010)]. Further progress in this area will most
likely arise from continuing collaboration between the statistics and climate
science communities, such as fostered since 1996 by the joint NSF/NCAR
Geophysical Statistics Project.

Supplementary figures and tables, data used, and scripts for performing
all analyses are available at:
http://www.meteo.psu.edu/~mann/supplements/AOAS/

SUPPLEMENTARY MATERIAL 

Supplement A: Supplemental figures (DOI: 10.1214/10-AOAS398DSUPPA;
.pdf). Additional Figures S1-S4 and Table S1.

Supplement B: Code and data for producing all figures and results in the
paper (DOI: 10.1214/10-AOAS398DSUPPB; .zip).